AITopics | pe block

A Low-Power Streaming Speech Enhancement Accelerator For Edge Devices

Wu, Ci-Hao, Chang, Tian-Sheuan

arXiv.org Artificial Intelligence, Mar-27-2025

Transformer-based speech enhancement models yield impressive results. However, their heterogeneous and complex structure restricts model compression potential, resulting in greater complexity and reduced hardware efficiency. Additionally, these models are not tailored for streaming and low-power applications. Addressing these challenges, this paper proposes a low-power streaming speech enhancement accelerator through model and hardware optimization. The proposed high-performance model is optimized for hardware execution through the co-design of model compression and target application; the proposed domain-aware and streaming-aware pruning techniques reduce model size by 93.9%. Latency is further reduced with batch normalization-based transformers. Additionally, we employ softmax-free attention, complemented by an extra batch normalization, facilitating simpler hardware design. The tailored hardware accommodates these diverse computing patterns by breaking them down into element-wise multiplication and accumulation (MAC). This is achieved through a 1-D processing array with configurable SRAM addressing, thereby minimizing hardware complexity and simplifying zero skipping.

Speech enhancement is crucial for various natural language processing (NLP) tasks, including speech recognition, machine translation, and hearing aids. Transformer-based speech enhancement models, such as [1], [2], have received significant attention in recent years due to their superior performance and parallel computing capabilities relative to other methods. The model shown in Figure 1 is heterogeneous, comprising an encoder and decoder that use convolutional neural networks (CNN) for speech extraction and restoration. In addition, it employs a masking module with transformers to filter out noise. However, its large model size and computational complexity become bottlenecks for low-power and real-time edge applications.
This work was supported by the National Science and Technology Council, Taiwan, under Grant 111-2622-8-A49-018-SB, 110-2221-E-A49-148-MY3, and 110-2218-E-A49-015-MBK.
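
The zero-skipping MAC idea mentioned in the abstract can be illustrated in software. The following is only a minimal sketch, not the paper's hardware design: the function name `mac_with_zero_skipping` and the sample values are hypothetical, and the loop stands in for a 1-D processing array that skips multiply-accumulate steps for weights that pruning has set to zero.

```python
def mac_with_zero_skipping(weights, activations):
    """Return (dot product, number of MAC operations actually performed).

    Hypothetical helper: weights equal to zero (for example, pruned
    weights) are skipped, so they cost no multiply-accumulate step.
    """
    if len(weights) != len(activations):
        raise ValueError("weights and activations must have equal length")
    acc = 0.0
    macs = 0
    for w, a in zip(weights, activations):
        if w == 0:
            continue  # zero skipping: no multiply, no accumulate
        acc += w * a
        macs += 1
    return acc, macs


# Illustrative values only: half of the weights are zero.
weights = [0.5, 0.0, 0.0, -1.0, 0.0, 2.0]
activations = [1.0, 3.0, -2.0, 4.0, 5.0, 0.5]
result, macs = mac_with_zero_skipping(weights, activations)
print(result, macs)
```

In this toy case the result matches the dense dot product while only the non-zero weights contribute MAC operations, which is the saving that zero skipping is meant to capture.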

artificial intelligence, machine learning, natural language, (18 more...)

arXiv: 2503.21335

Country:

Asia > Taiwan (0.25)
Europe > Belgium (0.04)

Genre: Research Report (1.00)

Industry:

Semiconductors & Electronics (1.00)
Health & Medicine > Therapeutic Area (0.54)

Technology:

Information Technology > Artificial Intelligence > Natural Language (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (1.00)

Recalibrating 3D ConvNets with Project & Excite

Rickmann, Anne-Marie, Roy, Abhijit Guha, Sarasua, Ignacio, Wachinger, Christian

arXiv.org Machine Learning, Feb-25-2020

Fully Convolutional Neural Networks (F-CNNs) achieve state-of-the-art performance for segmentation tasks in computer vision and medical imaging. Recently, computational blocks termed squeeze and excitation (SE) have been introduced to recalibrate F-CNN feature maps both channel- and spatial-wise, boosting segmentation performance while only minimally increasing the model complexity. So far, the development of SE blocks has focused on 2D architectures. For volumetric medical images, however, 3D F-CNNs are a natural choice. In this article, we extend existing 2D recalibration methods to 3D and propose a generic compress-process-recalibrate pipeline for easy comparison of such blocks. We further introduce Project & Excite (PE) modules, customized for 3D networks. In contrast to existing modules, Project & Excite does not perform global average pooling but compresses feature maps along different spatial dimensions of the tensor separately to retain more spatial information that is subsequently used in the excitation step. We evaluate the modules on two challenging tasks, whole-brain segmentation of MRI scans and whole-body segmentation of CT scans. We demonstrate that PE modules can be easily integrated into 3D F-CNNs, boosting performance by up to 0.3 in Dice score and outperforming 3D extensions of other recalibration blocks, while only marginally increasing the model complexity. Our code is publicly available at https://github.com/ai-med/squeeze_and_excitation.
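
The projection step described in the abstract can be sketched for a single-channel volume. This is a simplified illustration under stated assumptions, not the authors' implementation (which is in the linked repository): the names `project` and `recalibrate` are hypothetical, the three projections are combined by plain addition, and a parameter-free sigmoid gate stands in for the learned excitation layers.

```python
import math


def project(volume):
    """Average a [D][H][W] nested list along pairs of spatial axes.

    Returns three 1-D projections (one per axis) instead of the single
    scalar that global average pooling would produce.
    """
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    z_d = [sum(volume[d][h][w] for h in range(H) for w in range(W)) / (H * W)
           for d in range(D)]
    z_h = [sum(volume[d][h][w] for d in range(D) for w in range(W)) / (D * W)
           for h in range(H)]
    z_w = [sum(volume[d][h][w] for d in range(D) for h in range(H)) / (D * H)
           for w in range(W)]
    return z_d, z_h, z_w


def recalibrate(volume):
    """Rescale each voxel by a gate built from the three projections.

    Assumption: the gate is sigmoid(z_d + z_h + z_w) with no learned
    weights; a real module would learn this mapping.
    """
    D, H, W = len(volume), len(volume[0]), len(volume[0][0])
    z_d, z_h, z_w = project(volume)

    def gate(d, h, w):
        return 1.0 / (1.0 + math.exp(-(z_d[d] + z_h[h] + z_w[w])))

    return [[[volume[d][h][w] * gate(d, h, w) for w in range(W)]
             for h in range(H)]
            for d in range(D)]


# Illustrative 2 x 2 x 2 volume.
volume = [[[1, 2], [3, 4]], [[5, 6], [7, 8]]]
z_d, z_h, z_w = project(volume)
print(z_d, z_h, z_w)
recalibrated = recalibrate(volume)
```

Because each projection keeps one spatial axis, the gate varies from voxel to voxel, whereas a gate derived from a single global average would be constant across the whole volume.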

architecture, pe block, segmentation, (14 more...)

doi: 10.1109/TMI.2020.2972059

arXiv: 2002.10994

Country: Europe > Germany > North Rhine-Westphalia > Upper Bavaria > Munich (0.04)

Genre: Research Report (0.50)

Industry: Health & Medicine > Diagnostic Medicine > Imaging (1.00)

Technology:

Information Technology > Artificial Intelligence > Vision (1.00)
Information Technology > Artificial Intelligence > Machine Learning > Neural Networks > Deep Learning (0.48)
